Data bus between middleware layers

ABSTRACT

A system/method is introduced that integrates middleware components without canonicalization of data at runtime, where the system/method receives inputs identifying at least a first and second middleware to be made interoperative (via a communication path between an in-port corresponding to the first middleware and an out-port corresponding to the second middleware), receives an incoming message at the in-port, handles the received message as a plurality of parts and where, for each part, a data-object is created based on an identified type factory, with the in-port populating the data-object with values from corresponding part of the message and passing the populated data object from the in-port corresponding to the first middleware to the out-port corresponding to the second middleware.

This application is a divisional of pending U.S. application Ser. No. 11/307,022, filed Jan. 19, 2006. In addition, this application claims priority to U.S. Provisional Application 60/644,581 filed Jan. 19, 2005, which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of Invention

The present invention relates generally to the field of middleware integration. More specifically, the present invention is related to middleware integration without introducing canonical data formats, message formats, or protocols.

2. Discussion of Prior Art

Prior attempts to facilitate the integration of middleware typically introduce at least one of the following: a canonical data format; a canonical message format; a canonical protocol. The use of the term “canonical” would be understood by one of ordinary skill to include, among other features, the making of a physical copy of the information in the canonical format.

For example, some prior techniques have introduced a middleware bridge. One side of the bridge communicates using, say, middleware A; and the other side uses, say, middleware B. To allow A to communicate with B (in one direction), a mapping from A to the canonical data format has to be defined, and also a mapping from this canonical format to B. Communicating in the other direction requires a mapping from B to the canonical format, and a mapping from the canonical format to A.

If a new middleware, say, C is introduced, then a new mapping can be introduced from C to the canonical format, and from the canonical format to C. Combined with the previous mappings, this allows: two-way communication between A and C; two-way communication between B and C; and of course two-way communication between A and B.

The number of mappings that need to be introduced is O(n+n), where n is the number of middleware products/standards that need to be allowed to interoperate. It should be noted that O(x) is “of order x” and O(n²) is substantially larger than O(n), especially as the value of n increases. Comparing O(n) and O(n+n) is more difficult. At first sight, O(n+n) is larger than O(n), but the proper comparison of these depends on the details. A simple cost model for O(n) is W*n+C, where W is the cost for each element of n and C is some constant cost. One O(n) approach to integration may cost W₁*n+C₁. Another approach, that is O(n+n), may cost W₂*(n+n)+C₂. Nevertheless, the former may be larger than the latter because of the higher values of W₁ and/or C₁, compared to W₂ and/or C₂. It should also be noted that the value n must also include variations in the selected middleware. That is, it needs to be increased if a middleware can support two or more data formats, message formats and/or protocols.

A disadvantage of this approach is that to communicate, say, from A to B, the data must first be translated from A into the canonical format and then from the canonical format to B.

In other previous techniques, a canonical middleware is used to integrate middleware. When, say, middleware A sends a message to middleware B, the data is transformed into the data format of a canonical middleware, and the message is transmitted using that canonical middleware's message format and protocol. In order to introduce a third middleware, say, C, a mapping from C to the canonical middleware and from the canonical middleware to C, must be defined in order to facilitate two-way communication to/from C. The number of mappings is O(n+n), where n is the number of middleware products/standards. Again, the value n must include variations in the selected middleware. A disadvantage of this approach is that to communicate, say, from A to B, the data and messages need to be translated from the data and message formats and protocols of A into the canonical data and message formats and protocols, and from these to the data and message formats and protocols of B.

Another approach to the integration of middleware is to write direct integration connectors between the various middlewares of interest. For example, if middleware A, B and C are of interest, then connections A to B, B to A, A to C, C to A, B to C and C to B could be written (or whatever subset of these that is required). A disadvantage of this approach is that the number of connections that is required is O(n²).
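The growth rates discussed above can be illustrated with a small sketch. The function names are hypothetical and chosen only for this illustration; the counts assume one mapping to and one mapping from the canonical format per middleware, and one one-way connector per ordered pair of distinct middlewares.

```python
def canonical_mappings(n: int) -> int:
    """Mappings needed with a canonical format: one to it and one from it per middleware."""
    return n + n


def direct_connectors(n: int) -> int:
    """One-way connectors needed to link every ordered pair of distinct middlewares."""
    return n * (n - 1)


# Compare the two approaches for a few values of n.
for n in (3, 5, 10):
    print(n, canonical_mappings(n), direct_connectors(n))
```

For n = 3 both approaches need six mappings or connectors, which matches the A, B, C enumeration above; the direct approach grows faster as n increases.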

Whatever the precise merits, features, and advantages of the above cited techniques, none of them achieves or fulfills the purposes of the present invention.

SUMMARY OF THE INVENTION

Embodiments of the present invention relate to an integration product that has some special support for the integration of multiple middleware. Throughout this document, the integration product is often referred to as “Artix” or some variant thereof. The use of the term “Artix” is for convenience and is not intended to limit the embodiments of the present invention to the specific features of a particular software product, a particular version of a software application, or a suite of software applications that now, or in the future, have been marketed or developed under the label “Artix”.

The different middleware that Artix can integrate often use different formats for the data that they transmit, different message formats to hold this data, and/or different protocols to transmit these messages. Artix integrates middleware islands without introducing common/shared (sometimes the term canonical is used) data formats, message formats or protocols. Therefore, there is no need to introduce a canonical data format, canonical message format or canonical protocol when Artix is used to:

-   facilitate interoperability between a set of applications that use two or more middlewares,
-   integrate the middleware islands in an enterprise, or
-   integrate multiple enterprises.

Artix consists of the Artix Bus (which is sometimes enhanced to become the Artix Runtime Bus), and other components.

Artix allows middleware to communicate without necessarily introducing a canonical data format, canonical message format or canonical protocol. Where appropriate, any of a canonical data format, canonical message format and/or canonical protocol can be introduced if this is desirable.

Artix uses APIs rather than a canonical data format, canonical message format or canonical protocol. One of the advantages of the way in which Artix supports interoperability of middleware is that, despite the ability to do this without a canonical data format, message format or protocol, the amount of work required to support n middleware products/standards can be O(n) rather than O(n²).

Yet another advantage of Artix is that it involves fewer transformations between data formats, message formats and/or protocols. For example, to communicate, say, from A to B, data does not have to be transformed from A's format into a canonical data format and then from the canonical format into B's format. Instead, one transformation is all that is required: from A's format to B's format. Reducing the number of transformations gives performance and other advantages.

In some sense, Artix may be thought to use a design-time canonical format: a canonical format for interface definitions. In one implementation of Artix, WSDL is used as the canonical format for interfaces. In one implementation of Artix, XML Schema is used as the canonical definition for data that is used by these interfaces as defined in the canonical format. The configurator (see later) defines mappings between the canonical interface formats (aka the logical contract), and the binding for a particular middleware interaction (aka the physical contract). Despite this, Artix avoids canonicalization of data at runtime. Thus, Artix does not utilize traditional “canonicalization” in that a copy of information is not created during runtime.

It is also possible to write applications using the logical contracts, and these applications do not make direct use of any middleware APIs; yet the marshalling from the application layer to the middleware layer is as efficient as for applications written directly to the middleware layer. Thus, the use of the term middleware in describing the features of Artix is intended to include more than simply commercial middleware products or functionality. Rather, as used in relation to Artix, “middleware” can refer to almost any software that generates messages to transport to a recipient.

The present invention provides for a method to integrate a plurality of middleware without canonicalization of data at runtime, said method implemented using in-ports and out-ports, each in-port facilitating communication between a specific middleware and a first interface and each out-port facilitating communication between another specific middleware and said first interface, wherein the method comprises the steps of: (a) receiving inputs identifying at least a first and second middleware to be made interoperative, said interoperation implemented via at least one communication path between an in-port corresponding to said first middleware and an out-port corresponding to said second middleware; (b) receiving an incoming message at said in-port; (c) handling said received message as a plurality of parts and, for each part: (1) identifying an associated data type, (2) identifying a type factory for said identified data type, (3) creating at least one data-object based on said identified type factory, (4) passing said created data-object to said in-port, said in-port populating said data-object with values from corresponding part of said message, and (5) passing said populated data object from the in-port corresponding to said first middleware to said out-port corresponding to the second middleware; and (d) iteratively repeating step (c) for each part of said message. An article of manufacture is also disclosed that stores computer readable program code, which when executed by a computer, implements the above-mentioned method.

The present invention also provides for a method to integrate a plurality of middleware without canonicalization of data at runtime, said method implemented using in-ports and out-ports, each in-port facilitating communication between a specific middleware and a first interface and each out-port facilitating communication between another specific middleware and said first interface, wherein the method comprises the steps of: (a) receiving inputs identifying at least a first and second middleware to be made interoperative, said interoperation implemented via at least one communication path between an in-port corresponding to said first middleware and an out-port corresponding to said second middleware; (b) receiving an incoming message at said in-port; (c) handling said received message as a plurality of parts and, for each part: (1) identifying an associated data type, (2) identifying a type factory for said identified data type, (3) creating at least one data-object based on said identified type factory; (4) passing said created data-object to said in-port, and said in-port populating said data-object with values from corresponding part of said message; (d) iteratively repeating step (c) for each part of said message and populating a plurality of data-objects; and (e) passing said plurality of populated data objects from the in-port corresponding to said first middleware to said out-port corresponding to the second middleware. An article of manufacture is also disclosed that stores computer readable program code, which when executed by a computer, implements the above-mentioned method.
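Purely as an illustration of the per-part steps recited above, the following minimal sketch shows one way the flow could look. It is not the Artix implementation: the names DataObject, TYPE_FACTORIES, InPort, OutPort and transfer are hypothetical, and the message is assumed to arrive as a list of (type name, raw text) pairs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class DataObject:
    """Hypothetical data-object: a typed holder that an in-port fills in."""
    type_name: str
    value: Any = None


# Hypothetical registry mapping a data type name to its type factory.
TYPE_FACTORIES: Dict[str, Callable[[], DataObject]] = {
    "string": lambda: DataObject("string"),
    "int": lambda: DataObject("int"),
}


class InPort:
    """Assumed in-port whose messages are lists of (type name, raw text) parts."""

    def __init__(self, message):
        self.parts = list(message)

    def populate(self, index, obj):
        # Fill the supplied data-object from the corresponding message part.
        type_name, raw = self.parts[index]
        obj.value = int(raw) if type_name == "int" else raw


class OutPort:
    """Assumed out-port that simply records the data-objects it is given."""

    def __init__(self):
        self.received = []

    def accept(self, obj):
        self.received.append(obj)


def transfer(in_port, out_port):
    """Handle the message one part at a time, without building an intermediate copy."""
    for index, (type_name, _raw) in enumerate(in_port.parts):
        factory = TYPE_FACTORIES[type_name]  # steps (1) and (2)
        obj = factory()                      # step (3)
        in_port.populate(index, obj)         # step (4)
        out_port.accept(obj)                 # step (5)


in_port = InPort([("string", "ACME"), ("int", "42")])
out_port = OutPort()
transfer(in_port, out_port)
print([(o.type_name, o.value) for o in out_port.received])
```

The sketch follows the first variant, in which each populated data-object is passed on immediately; the second variant would collect all populated data-objects first and pass them on together.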

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an Artix Bus (or the Artix runtime bus) configured with ports to middlewares A, B and C.

FIG. 2 illustrates sophisticated ports for data transformation and runtime routing.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

While this invention is illustrated and described in a preferred embodiment, the invention may be produced in many different configurations. There is depicted in the drawings, and will herein be described in detail, a preferred embodiment of the invention, with the understanding that the present disclosure is to be considered as an exemplification of the principles of the invention and the associated functional specifications for its construction and is not intended to limit the invention to the embodiment illustrated. Those skilled in the art will envision many other possible variations within the scope of the present invention.

It should be noted that all through the specification, the term “message” has been used as an example to describe the various embodiments of the present invention. However, it should be noted that the interpretation of the present invention should not be limited by such terminology. Other equivalents, such as, block of data or intermediate canonical block of data, can be substituted without departing from the scope of the present invention.

In each use of Artix, the person who configures Artix (the configurator), specifies the middlewares that are in use in the system (more accurately, the middlewares that are to be made to interoperate via Artix), and the interoperability that is required. If, say, middlewares A, B, C and D are in use, the following interoperability may be required in one example installation:

A to B, B to A, A to C, C to A, C to B, B to C and A to D, D to B

The configuration choices are O(n²), which increases the choices for the person configuring Artix. However, as said previously, the work to extend Artix to support n middleware is O(n).

Interoperability that isn't required does not have to be included.

This has a number of advantages, such as:

-   no effort is required to specify/program interoperability connections that aren't required; and
-   at runtime, the size of Artix can be reduced.

FIG. 1 shows an Artix Bus (sometimes called the Artix runtime bus) configured with some ports to middlewares A, B and C. Both an in-port and an out-port are included in the case of middleware A and C, and only an out-port is shown for middleware B. Also shown are two paths, in this case from A to B, and from A to C. These paths indicate that this Artix Bus can accept calls from middleware A and route them to middlewares B or C. Two in-ports are shown for A, but the two paths may also be able to share a single in-port for A.

The ART configuration data, among other things, lists items such as when, how and from where to load the ports (e.g., what DLLs they are stored in). The ART configuration also contains information about the threading configuration (the threading configuration can include items such as: for a client, creating a thread per call or per target entity, or having a thread pool; for a server, creating a thread per call, or per target entity, or having a thread pool; there are many such configurations that would be known to one of ordinary skill in this field), implementation interceptors (that enforce policies but should not be visible in the WSDL), and bootstrap information such as pointers to the WSDL that contains the definition of the ports (both physically and logically) along with the routing table.

An in-port is a connection between a given middleware and the APIs of the Artix Bus that can handle messages (here and elsewhere, the term message is used, but various middleware uses different terminology, such as operation call, for this core concept) coming from that middleware to the Artix Bus.

An out-port is a connection between a given middleware and the APIs of the Artix Bus that can handle messages going from the Artix Bus to that middleware.

One of the benefits of Artix is there can be partial support for a middleware. For example, for a given middleware there can be an in-port and no out-port, or an out-port and no in-port. An Artix Bus has a full port for a given middleware if it has both an in-port and an out-port for it.

In a different configuration of an Artix Bus, some of the ports shown in FIG. 1 may not be configured. For example, the in-port for C may be missing. New ports could also be added, such as an out-port for A.

Sometimes we use the term connector or connection to refer to a port. We also sometimes use the term route to refer to paths.

Note that the Artix Bus does not have to deal with the data formats, message formats or protocols of the various middlewares: these are handled by the in-ports and out-ports.

A port is made up of a binding and a transport. The binding deals with the data formatting (e.g., fixed, SOAP, tagged, and so on) and other aspects, and the transport deals with the communications protocol or middleware (e.g., HTTP, MQ, TCP/IP, IIOP, and so on) and other aspects. These can be mixed and matched: a binding for fixed format messages can be combined with, say, the IIOP protocol/middleware to give one port; while another port can be obtained from the combination of the fixed binding and the MQ protocol/middleware.
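The mix-and-match idea can be sketched as follows. The Port class and the variable names are hypothetical and exist only for this illustration; the binding and transport labels are taken from the examples in the text.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Port:
    """Hypothetical port: a binding (data formatting) paired with a transport."""
    binding: str
    transport: str


bindings = ["fixed", "SOAP", "tagged"]
transports = ["HTTP", "MQ", "IIOP"]

# The same fixed binding combined with two different transports gives two ports.
fixed_over_iiop = Port("fixed", "IIOP")
fixed_over_mq = Port("fixed", "MQ")

# Every binding can in principle be paired with every transport.
all_ports = [Port(b, t) for b, t in product(bindings, transports)]
print(len(all_ports))
```

Under this assumption, supporting a new transport means writing one transport component, which can then be paired with the existing bindings rather than rewritten for each of them.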

More Sophisticated Paths

As well as the paths shown in FIG. 1, more sophisticated paths can be configured to include steps such as transformation (for example, of the data, the data format or operations) and routing based on runtime data (such as the content of a message, or some table that can be updated at runtime). Two examples of these are shown in FIG. 2 (a path from A to B or C, with runtime routing that decides whether to route messages from A to B or from A to C; a path from A to C that includes a data format transformation).

A number of approaches can be taken to routing, including the following.

-   It can be based on the port number of an incoming message, so that all messages that arrive on a given port are sent along a specific path (or part of a path) that is configured in the routing information.
-   It can be based on the operation name (which may be called other things, such as a subject, in some middleware), so that all incoming messages with a given operation name are sent along a specific path (or part of a path) that is configured in the routing information.
-   It can be based on the content of a message, so that all incoming messages with content that fulfils certain criteria are sent along a specific path (or part of a path) that is configured in the routing information.
-   Fan-out can be used, whereby messages are sent along multiple paths (or parts of paths) that are configured in the routing information.
-   Failover can be used, whereby messages are sent along one available path (or part of a path) from a set of paths (or parts of paths) that is configured in the routing information. Availability can be determined in a number of ways, including: the availability of the final destination of a message; the availability of some or all of the entities between the router and the final destination.
-   These and other approaches can be mixed. For example: port and operation routing can be mixed; port, operation and content routing can be mixed; failover and fan-out can be mixed; failover and content can be mixed.
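The routing approaches listed above can be sketched as small functions that each return the list of paths a message should follow. This is an illustrative sketch only: the function names, the dictionary-based message shape and the path labels such as "A->B" are assumptions, not part of the routing information format.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Message = Dict[str, object]


def route_by_port(table: Dict[int, str], msg: Message) -> List[str]:
    """All messages arriving on a given port follow the configured path."""
    return [table[msg["port"]]]


def route_by_operation(table: Dict[str, str], msg: Message) -> List[str]:
    """All messages with a given operation name follow the configured path."""
    return [table[msg["operation"]]]


def route_by_content(rules: Sequence[Tuple[Callable[[Message], bool], str]],
                     msg: Message, default: str) -> List[str]:
    """The first rule whose predicate accepts the message decides the path."""
    for predicate, path in rules:
        if predicate(msg):
            return [path]
    return [default]


def fan_out(paths: Sequence[str]) -> List[str]:
    """The message is sent along every configured path."""
    return list(paths)


def failover(paths: Sequence[str], is_available: Callable[[str], bool]) -> List[str]:
    """The message is sent along the first path that is currently available."""
    for path in paths:
        if is_available(path):
            return [path]
    return []


msg = {"port": 9000, "operation": "getQuote", "body": {"amount": 1500}}
by_port = route_by_port({9000: "A->B"}, msg)
by_operation = route_by_operation({"getQuote": "A->B"}, msg)
by_content = route_by_content(
    [(lambda m: m["body"]["amount"] > 1000, "A->C")], msg, default="A->B")
everywhere = fan_out(["A->B", "A->C"])
fallback = failover(["A->B", "A->C"], is_available=lambda p: p != "A->B")
print(by_port, by_operation, by_content, everywhere, fallback)
```

Because each function returns a list of paths, approaches can be mixed by composing them, for example by applying failover to the candidates selected by content routing.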

Also, the router is a separate component of Artix; it does not have to be configured to be present at all, thus simplifying the Artix installation and removing overhead.

In one embodiment, paths are implemented using interceptor chains on a technology such as ART. As well as ports (bindings and transports), a path may also contain other elements such as: transformers, runtime-routers, load-balancing elements, fault tolerance elements, “ility” elements, session management elements, orchestration controllers (for example, to implement BPEL), and so on. A path may also contain user-supplied elements (such as those to: enhance security; propagate transaction information and/or control transaction boundaries; and so on). Interceptors provide an efficient policy-enforcement mechanism, but also encapsulate certain technical artefacts from the application layer. This is because middleware specific behaviour can be contained in the interceptors. Interceptors function at either the request level or the transport level, which provides different views of the request context that the interceptor manipulates.

Request level interceptors are able to use the XMLSchema APIs to access the request payload as discrete structures and elements, which allows them to make decisions based upon, or to alter the content of the request. The XMLSchema API system prevents them from being tied to the syntax of a particular message format, without having to physically canonicalize the request payload.

Message level interceptors act at the byte-stream view of the request payload. This allows them to perform bulk operations such as hashing, code-page conversion, compression, encryption, encoding, and digital signatures.
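The two views can be illustrated with a minimal sketch of a chain on the sending side. The function names and the use of JSON as the wire format are assumptions made only for this example; the request-level interceptor sees structured content, while the message-level interceptor sees only bytes.

```python
import json
import zlib


def add_audit_field(request: dict) -> dict:
    """Request-level interceptor: inspects and alters structured content."""
    return {**request, "audited": True}


def compress(data: bytes) -> bytes:
    """Message-level interceptor: a bulk operation on the byte stream."""
    return zlib.compress(data)


def to_bytes(request: dict) -> bytes:
    """Assumed binding step that turns the structured request into bytes."""
    return json.dumps(request, sort_keys=True).encode("utf-8")


def run_chain(request, request_interceptors, serialize, message_interceptors):
    """Apply request-level interceptors, serialize, then message-level ones."""
    for interceptor in request_interceptors:
        request = interceptor(request)
    data = serialize(request)
    for interceptor in message_interceptors:
        data = interceptor(data)
    return data


wire = run_chain({"op": "getQuote"}, [add_audit_field], to_bytes, [compress])
print(json.loads(zlib.decompress(wire)))
```

Keeping the two lists separate reflects the distinction drawn above: content-based decisions happen before the payload becomes a byte stream, and bulk operations such as compression happen afterwards.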

The interceptor chains are assembled by consulting the annotations in the service contract (e.g., extensors in the WSDL), plus non-contractual implementation strategy details (e.g., the ART configuration), but the chains can be expanded or reordered programmatically during initial assembly of the chain.

Some of the Ways of Using Artix

Among other ways, Artix can be used in the following five ways (and these uses can be combined):

-   a. As a switch between different middleware. Typically in this usage, there will be one or more in-ports and one or more out-ports, with the possibility of having other path elements.
-   b. (Embedded in clients) To allow clients to be built directly on it. Typically, in this usage, there will be one or more out-ports (with the possibility of having other path elements), and a client directly uses the Artix APIs to send messages.
-   c. (Embedded in servers) To allow servers to be built directly on it. Typically, in this usage, there will be one or more in-ports (with the possibility of having other path elements), and a server directly uses the Artix APIs to receive messages.
-   d. (Co-located with clients) This functionality is similar to the “switch” arrangement but the Artix environment is co-located on the same machine as the client application.
-   e. (Co-located with the servers) This functionality is similar to the “switch” arrangement but the Artix environment is co-located on the same machine as the server application.

The remainder of this section discusses these in more detail. It is also expressly contemplated that more than one Artix environment may be utilized concurrently such that a combination of the above arrangements can be in place in a particular enterprise architecture.

A switch can be in-process with one of the processes that are communicating, or in a different process sitting logically in the middle between some communicating processes. In discussing the use of Artix as a “switch” it should be kept in mind that similar behaviour applies to the last two arrangements described above in which the Artix environment is co-located with either the client or the server.

The switch can operate in a number of ways, including:

-   Traditional hub-and-spoke in which the intermediary accepts all inputs (i.e., has all in-ports) and can forward messages to all outputs (i.e., has all out-ports).
-   It can have one in-port, to middleware MI, and many out-ports. This allows services available on many middlewares (those for which an out-port is configured) to be made available to applications that can make calls to applications via MI. For example, where MI is SOAP/Web-services, this Artix configuration becomes similar in capability to IBM's WSF.
-   It can have many in-ports and one out-port, MO. This allows services implemented in a wide range of middlewares (those for which an in-port is configured) to use the services provided by applications that can be communicated with via MO. This is useful in many situations. For example, a well established service available over middleware MO, can be made available to further applications without proliferating MO. Instead, it can be accessed using a range of other middlewares.
-   It can have many in-ports and many out-ports.

As a special case of using Artix, applications can be built directly on top of it instead of directly on top of a specific middleware. Artix provides such applications (both clients and servers, as explained above) with APIs that allow them to be written independently of the middleware or middlewares that are actually underneath. A client application can use the Artix APIs to send messages, without concern for which middleware or middlewares are actually used to transmit its messages and accept replies/messages coming to it. The middleware or middlewares can be changed by reconfiguring Artix. Similarly, a server application can use Artix APIs to receive messages, without concern for the middleware or middlewares that are actually used to accept its messages and send replies/messages.

The APIs used by clients and servers are model driven. They contain no artefacts that reveal the underlying middleware. (In contrast, middleware products offer APIs that reveal the middleware in use. CORBA functions, for example, have extra parameters that are of special CORBA types, and hence reveal that the functions are making use of CORBA.) The Artix APIs are generated from the contracts (defined, for example, in WSDL, or defined in some middleware's native interface definition mechanism and understood by Artix). Artix APIs in Java adhere to the JAX-RPC standard; Artix APIs in C++ adhere to the corresponding Object Management Group's (OMG) C++ standard.

This means that users of Artix APIs need not be aware: of what middleware is in use; that any middleware is being used (i.e., the calls look like normal in-process calls); or that Artix is being used. However, Artix APIs normally have extensions (extra functions) that offer advanced features/controls, and the use of these does reveal that Artix is being used, but the choice of underlying middleware is not always revealed. The configuration choices described in this section and others are made by configuring the Artix Bus. No programming steps are required to carry out this configuration as long as the necessary ports are available. Configuration changes can be made before an Artix Bus is started, or after it has been started (either by: stopping it, making the changes and re-starting it; or while it is still running).

These choices make it easier to fit Artix into the architecture of the IT system (or part of it) of an enterprise (or group of enterprises), rather than having to modify the architecture of the IT system to suit Artix.

How the Artix Bus works

The Artix Bus can be used to pass data from an in-port to an out-port (and hence, for example, from one middleware to another). It can directly pass the data across, or it can perform certain actions on the data, such as transforming it in various ways. The Artix Bus can also be configured to route data in various different ways: for example, it can have a routing table that informs it of what out-port it should send certain data to.

In a simple form, the Artix Bus can be configured with one in-port, and this can be in-process with either a client or a server.

In a simple form as a switch, the Artix Bus gets data from an in-port and passes it to an out-port. All of the data in an incoming message can be passed from an in-port to an out-port in one step; or the data can be passed across in a number of parts, one or more parts at a time. A common way to achieve the latter is for the Artix Bus to pass one top-level element of a message at a time from an in-port to an out-port; yet another way is for it to pass single data elements, or small groups of these, at a time.

In one implementation, the Artix Bus makes a call to an in-port to get (to pull) one element of data, and then it makes a call on the out-port to push that data element to it. (It is obvious to one of normal skill in this area, that this data flow could be implemented in other ways, for example: the data could be passed from the in-port to the Artix Bus by the former making a call on the latter; and also that the data could be passed from the Artix Bus to an out-port by the latter calling the former.)

In one implementation of the Artix Bus, a pull-push model is used to pass data from an in-port to the Artix Bus. The Artix Bus makes a pull call on the in-port to get a data element; but if this is a complex data element then the in-port makes one or more push calls on the Artix Bus to pass it the constituent data elements of the overall complex data element.

In one implementation of the Artix Bus, a push-pull model is used to pass data from the Artix Bus to an out-port. The Artix Bus makes a push call on the out-port to pass it a data element; but if this is a complex data element then the out-port makes one or more pull calls on the Artix Bus to get the constituent data elements of the overall complex data element.
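The pull-push and push-pull interplay described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Artix API: the Bus, InPort and OutPort classes and their method names are hypothetical, a complex element is modelled as a Python list, and only one level of nesting is handled.

```python
from collections import deque


class Bus:
    """Hypothetical bus that relays elements and briefly holds constituents."""

    def __init__(self):
        self._pending = deque()

    def push(self, element):
        # Called by an in-port to hand over one constituent of a complex element.
        self._pending.append(element)

    def pull(self):
        # Called by an out-port to fetch one constituent of a complex element.
        return self._pending.popleft()

    def transfer(self, in_port, out_port):
        while in_port.has_more():
            element = in_port.pull(self)   # bus pulls from the in-port
            out_port.push(element, self)   # bus pushes to the out-port


class InPort:
    def __init__(self, elements):
        self._elements = deque(elements)

    def has_more(self):
        return bool(self._elements)

    def pull(self, bus):
        element = self._elements.popleft()
        if isinstance(element, list):
            # Complex element: push each constituent onto the bus.
            for child in element:
                bus.push(child)
            return ("complex", len(element))
        return ("simple", element)


class OutPort:
    def __init__(self):
        self.sent = []

    def push(self, element, bus):
        kind, payload = element
        if kind == "complex":
            # Complex element: pull each constituent back from the bus.
            self.sent.append([bus.pull() for _ in range(payload)])
        else:
            self.sent.append(payload)


bus = Bus()
out_port = OutPort()
bus.transfer(InPort(["ACME", [1, 2, 3], 42]), out_port)
print(out_port.sent)
```

In this sketch the bus never assembles the whole message: each top-level element crosses from in-port to out-port before the next one is pulled.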

The Artix Bus does not hold the data in a canonical format. There are a number of reasons for this:

-   The Artix Bus does not have to deal with the data of full messages as a unit, but instead it can handle the data in a message as a number of parts, one part at a time. Data can be passed from an in-port to the Artix Bus one part (to various granularities) at a time (in some cases, without copying); and from the Artix Bus to an out-port one part at a time. In these cases, the Artix Bus does not have to define any rules for how a set of data elements are stored to make up a message; for example, it does not have to serialise these data elements into an overall message. It does not have to deal with issues such as byte alignment or padding. That is, it does not have to define a canonical message format. This simplifies the Artix Bus and in/out-ports.
-   The Artix Bus has no knowledge of, or concern for, the details of how the primitive/basic and composite types are stored in memory; that is, their bit and byte layout. It sees them simply as programming language data values in whatever programming language it is written in (for example, C++ or Java). This bit and byte layout can vary between different programming language compilers on the same underlying machine architecture, and between different machine architectures. Even a single programming language compiler on a single machine architecture could have different configurations/switches that could cause it to use different bit and byte layouts.
-   Copies of data elements, parts of messages or even whole messages do not always have to be made.

As mentioned previously, prior techniques that introduced conventional canonical formats required the use of one or more of a canonical data format, a canonical message format, or a canonical protocol when translating messages between two different middlewares. This typically involved copying a message in a first format to an intermediate message in the canonical format and then transforming that intermediate message to a third message in a second format. In contrast, the Artix bus does not have to deal with the data of a message as a contiguous unit but instead can handle the message data as a number of parts, one part at a time.

Through the use of APIs to interact with a message, the Artix bus can pass data from an in-port to an out-port one part at a time. In these cases, the Artix bus does not have to define rules for how a set of data elements (or parts) are stored to make up a message, nor does it degrade performance and efficiency by making copies of the message. In other words, the Artix bus does not have to deal with issues, such as byte alignment or padding, that must be addressed when canonical message formats are used.

The “parts” of data that are accessed using the appropriate APIs can have different levels of granularity and may refer to, for example, a string, an integer, some more complex data type, or the entire message.

Thus, in one exemplary arrangement, applications utilizing the Artix bus are able to accept an incoming message from an underlying middleware without being concerned with the message format, data format, or protocol used to receive the data. Instead, the application receives the message independently of the underlying middleware and handles the message data as a list of data objects that are individually accessible through an API.

Yet another way in which Artix does not need a canonical format for data is seen in how it handles text-based data. In both the blocking and streaming approaches, text data is encapsulated inside small objects and is accessed by in-ports, intermediaries and out-ports via APIs. Internally within these encapsulating objects, the text-based data can be represented in different formats, either at different times or at the same time. For example, an in-port may enter text-based data in UTF-8 format; a first intermediary may request it in Unicode format, at which time the encapsulating object may store the data in both UTF-8 and Unicode formats; subsequent intermediaries and the out-port along the path can request the data in any format, and if the encapsulating object already has the data in the requested format then it can deliver it without further conversion. This can work for any number of encodings.

Encapsulating objects may also be configured to dispose of unwanted encodings: this is important for large text-based data, in order to avoid the wastage of memory. One method to achieve this is for an encapsulating object to be given a hint that a particular encoding is not needed further along a given path. One way to facilitate this is for each element of a path to list the encoding types that it needs; and for a configuration sub-system of the Artix bus to make this information available to the encapsulating objects, or to make calls to the encapsulating objects to let them know what data is required and what can be discarded.
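The behaviour of such an encapsulating object can be illustrated with a minimal sketch. The class name `EncodedText`, its methods, and the choice of UTF-16 (little-endian) to stand in for the "Unicode format" are assumptions made for illustration only; they are not taken from any Artix API.

```python
class EncodedText:
    """Hypothetical encapsulating object holding one text value in several encodings."""

    def __init__(self, raw: bytes, encoding: str):
        self._forms = {encoding: raw}   # encoding name -> encoded bytes
        self.conversions = 0            # counts real conversions, for illustration

    def get(self, encoding: str) -> bytes:
        # Deliver a stored form when one exists; convert only on a miss.
        if encoding not in self._forms:
            src_encoding, src_bytes = next(iter(self._forms.items()))
            self._forms[encoding] = src_bytes.decode(src_encoding).encode(encoding)
            self.conversions += 1
        return self._forms[encoding]

    def discard(self, encoding: str) -> None:
        # Hint that no later element on the path needs this encoding.
        # The last remaining form is never discarded.
        if len(self._forms) > 1:
            self._forms.pop(encoding, None)

    def held_encodings(self):
        return sorted(self._forms)


text = EncodedText("caf\u00e9".encode("utf-8"), "utf-8")  # in-port enters UTF-8
utf16 = text.get("utf-16-le")    # first intermediary asks for another encoding
again = text.get("utf-16-le")    # later element: served without reconversion
text.discard("utf-8")            # configuration hint: UTF-8 no longer needed
```

In this sketch the second request returns the very same bytes object as the first, and only one conversion is ever performed; the discard hint then releases the UTF-8 form.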

How Operations are Recognized

Each in-port has a listener that recognizes the arrival of incoming messages. It is important to the operation of Artix that the “operation” corresponding to each message is recognized. An operation is an abstract notion, corresponding to the notion of an operation in web service's WSDL. At a high level, an operation's identifier (e.g., its name; some middleware systems call this the subject, some call it the method name) tells the recipient of a message what part of the service that it offers is actually being used by the client that sent the message. At a lower level, an operation's identifier can be used to determine what other data is contained in a message. One example operation may have two integer parts (or parameters), while another may have many more and very complex parts.

An in-port can use a number of mechanisms for determining the operation corresponding to an incoming message, including the following amongst others:

-   -   Where all messages sent to a given in-port are of the same operation, the determination is trivial.
    -   The size of a message may determine the operation.
    -   The message header added/managed by the middleware handled by the port may contain enough information to determine the operation. Examples include: the operation name field of an IIOP message; a Tuxedo service name; a SOAP/HTTP action header.
    -   In some circumstances, the in-port has to parse part of the payload of a message to determine the operation. The extent of the parsing varies greatly from middleware to middleware. For the Fixed data format, there is a discriminator, which is a fixed value per operation. For XML data formatting, an XML parser (e.g., a SAX XML parser) may be required to find the XML element or elements that hold the operation identifier.
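The mechanisms listed above can be sketched as a chain that tries the cheapest source first and parses the payload only when necessary. The function names, the discriminator table, the use of a `SOAPAction` header key, and the assumption that the root XML element names the operation are all hypothetical choices made for this illustration.

```python
import io
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical discriminator table for a fixed-format binding.
FIXED_DISCRIMINATORS = {b"01": "getQuote", b"02": "placeOrder"}


def from_header(headers: dict) -> Optional[str]:
    # Header-based determination; the header name is illustrative.
    action = headers.get("SOAPAction")
    return action.strip('"') if action else None


def from_fixed_payload(payload: bytes) -> Optional[str]:
    # Only the leading discriminator bytes are inspected.
    return FIXED_DISCRIMINATORS.get(payload[:2])


def from_xml_payload(payload: bytes) -> Optional[str]:
    # Return at the first start-tag event instead of building a whole tree.
    try:
        for _event, element in ET.iterparse(io.BytesIO(payload), events=("start",)):
            return element.tag.rsplit("}", 1)[-1]   # drop any "{namespace}" prefix
    except ET.ParseError:
        pass
    return None


def determine_operation(headers: dict, payload: bytes) -> Optional[str]:
    return (from_header(headers)
            or from_fixed_payload(payload)
            or from_xml_payload(payload))
```

For example, `determine_operation({}, b"02ACME")` would resolve the operation from the two-byte discriminator alone, without touching the rest of the payload.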

A minimal degree of parsing is advantageous as it speeds up the determination of the operation. Where parts of a message need to be parsed, the parsed data can be retained by Artix so that it is more cheaply available to code that handles the message once the operation has been determined.

Handling a Message

In some cases, Artix can handle a message without parsing/processing the payload; in other cases, it needs to do some degree of parsing/processing. These are described in the following two subsections.

No Need to Parse/Process: Pass Through

Where the data and message format of a message do not need to change, Artix can pass the message from an in-port to an out-port (passing through any intermediaries that are configured along the path) without further parsing/processing, and in some cases without having to make a copy. This would arise, for example, when an in-port receives a SOAP over HTTP message and the out-port is to send this as a SOAP over MQ message. In some other cases, where the format of some data part or parts does not need to be changed, Artix can pass that part or parts from an in-port to an out-port (passing through any intermediaries that are configured along the path) without any further parsing/processing, and in some cases without having to make a copy.

In some cases, Artix may not even need to know the operation that is involved. On the other hand, it may need to know this if, for example, the operation is used to drive the middleware for the in-port or out-port (e.g., it has to be told to the out-port so that the out-port can inform its underlying middleware of it).

In yet another example, Artix may also need to know the operation if there is some form of operation routing being carried out. For example, Artix could be configured to send certain of the operations from an in-port to out-port B, others to out-port C, and so on.
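Operation routing combined with pass-through can be pictured with a small sketch. The `OutPort` and `OperationRouter` classes below are hypothetical and exist only to show that the message can be handed on untouched once the operation is known.

```python
class OutPort:
    """Hypothetical out-port; send() stands in for a call into the middleware."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def send(self, message):
        self.sent.append(message)


class OperationRouter:
    """Chooses an out-port per operation and passes the message through as-is."""

    def __init__(self, default):
        self._default = default
        self._routes = {}

    def route(self, operation, out_port):
        self._routes[operation] = out_port

    def dispatch(self, operation, message):
        target = self._routes.get(operation, self._default)
        target.send(message)   # same object is handed on: no parsing, no copy
        return target


port_b, port_c = OutPort("B"), OutPort("C")
router = OperationRouter(default=port_c)
router.route("placeOrder", port_b)

payload = b"<placeOrder>...</placeOrder>"
router.dispatch("placeOrder", payload)   # goes to out-port B
router.dispatch("getQuote", payload)     # falls back to out-port C
```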

Need to Parse/Process: Type Driven Marshalling

The second approach is for Artix to parse/process the incoming message, and there are two ways in which this is done. They will be discussed in turn later in this sub-section. They both allow for the option of changing the data or message format between an incoming message and the resulting outgoing message(s).

To do this, Artix depends on canonical type definitions for the data corresponding to each operation, and in one implementation of Artix it uses XML to represent these. The XML language used is web service's WSDL (Web Services Description Language). Note that Artix does not have to use web service's data format (XML, or the XML language SOAP) to transmit or store data in a canonical format; it just needs some canonical way of describing data.

Each type in XML is defined within the context of a name space, and we use the format foo:A to refer to the type A defined in the name space foo. We sometimes use the term Qname (qualified name) to refer to a full name such as foo:A, and we sometimes use the term simple name to refer to a name such as A.

(In one implementation of Artix, we recognize that some users of Artix do not use name spaces well, and this may result in there being more than one type with the same simple name. In these cases, Artix can use the WSDL URL of the name space plus the simple name to constitute the Qname.)

Artix has a data structure that lists the operations that can be sent or received on each port, and, for each, the data type of each message part (the term part is used here, but the term parameter may be more appropriate for certain uses, especially for ports to or from operation-based, rather than messaging-based systems) and their ordering rules. Each part's data type is recorded as a Qname, and this acts as a reference to the data definition.
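One possible shape for such a data structure is sketched below. The names `QName`, `Part`, `Operation`, `PORT_OPERATIONS` and `part_types`, the port name `ip1`, and the `urn:foo` name space are hypothetical; only the idea of an ordered list of parts, each typed by a Qname, comes from the description above.

```python
from dataclasses import dataclass
from typing import NamedTuple, Tuple

XSD = "http://www.w3.org/2001/XMLSchema"


class QName(NamedTuple):
    namespace: str
    local: str


@dataclass(frozen=True)
class Part:
    name: str
    type: QName          # reference to the data definition


@dataclass(frozen=True)
class Operation:
    name: str
    parts: Tuple[Part, ...]   # tuple order records the ordering of the parts


# Per-port table of the operations that can be sent or received.
PORT_OPERATIONS = {
    "ip1": {
        "placeOrder": Operation("placeOrder", (
            Part("order", QName("urn:foo", "Order")),
            Part("quantity", QName(XSD, "int")),
        )),
    },
}


def part_types(port, operation):
    """Return the ordered Qnames of the parts of one operation on one port."""
    return [part.type for part in PORT_OPERATIONS[port][operation].parts]
```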

As stated in the introduction to this sub-section, there are two ways in which Artix parses the message and passes it from an in-port to an out-port. We refer to these as blocking and streaming, and they will now be discussed in turn.

Blocking

For each Qname, Artix has a type factory that knows how to create data-objects for that type. In one implementation of Artix, data-objects are of type xsd:any; and this is castable to concrete generated types. Artix has a data structure that allows it to find the correct type factory given a Qname.

The code for a type factory can either be generated from the definition of the corresponding type, or generic type factory code can be used (that is, shared code that is given the type definition at runtime and then behaves much like a type factory generated from the same type definition would behave).

The code for a data-object can either be generated from the definition of the corresponding type, or generic data-object code can be used (that is, shared code that is given the type definition at runtime and then behaves much like a data-object generated from the same type definition would behave).

When the Artix Bus is informed of an incoming message, it looks up the set of Qnames for the message parts of the incoming message.

Given the Qname for each part of the message in turn, the Artix Bus asks the corresponding type factory to create a data-object, and passes this data-object to the in-port so that the data-object can be populated with values from the incoming message. In one implementation of Artix, a data-object is created for each top-most part of the incoming message; in yet another implementation, only one data-object is created for the whole incoming message; other implementations (and granularities) are also possible.

Finally, after a data-object has been populated with values from the incoming message, the Artix Bus passes the data-object to an out-port so that it can be written out.
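The blocking steps just described (look up the Qname of each part, ask the type factory for a data-object, populate it via the in-port, write it to the out-port) can be sketched as follows. All class and function names are hypothetical, the in-port is simplified to a list of already separated fields, and the out-port emits an XML-like text purely for illustration.

```python
class IntObject:
    def __init__(self):
        self.value = None

    def populate(self, in_port):
        self.value = in_port.read_int()

    def write(self, out_port):
        out_port.write_int(self.value)


class StringObject:
    def __init__(self):
        self.value = None

    def populate(self, in_port):
        self.value = in_port.read_string()

    def write(self, out_port):
        out_port.write_string(self.value)


# Lookup structure: Qname -> type factory (here simply the class itself).
TYPE_FACTORIES = {"xsd:int": IntObject, "xsd:string": StringObject}


class FieldInPort:
    """Hypothetical in-port whose incoming message is a list of raw fields."""

    def __init__(self, fields):
        self._fields = iter(fields)

    def read_int(self):
        return int(next(self._fields))

    def read_string(self):
        return str(next(self._fields))


class TextOutPort:
    """Hypothetical out-port that formats each value in its own way."""

    def __init__(self):
        self.out = []

    def write_int(self, value):
        self.out.append(f"<int>{value}</int>")

    def write_string(self, value):
        self.out.append(f"<string>{value}</string>")


def transfer(part_qnames, in_port, out_port):
    # One part at a time: create, populate, write out.
    for qname in part_qnames:
        data_object = TYPE_FACTORIES[qname]()
        data_object.populate(in_port)
        data_object.write(out_port)


out_port = TextOutPort()
transfer(["xsd:int", "xsd:string"], FieldInPort(["42", "ACME"]), out_port)
```

Note that neither port knows the other's format; the data-objects are the only thing that passes between them.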

In one implementation of Artix, the Artix Bus creates all of the data-objects for a message, and passes these at one time to an in-port so that they can be populated. In one implementation of Artix, the Artix Bus passes all of the populated data-objects for a message to an out-port in one step, so that they can be written out.

In yet another implementation of Artix, the Artix Bus creates one data-object at a time and, as each is created, passes it to the in-port so that it can be populated; once populated, it is passed to the out-port. In this way, the creation and formatting of the output data and message (and possibly the use of the underlying middleware) can take place in parallel with the processing of the input message.

One of ordinary skill in this area will understand that other variations in the sequence/ordering of creation/population/write-out are also possible.

Note on the Handling of Simple Types

Each in-port and each out-port knows how to handle the simple types that it can process (e.g., some subset of those defined in the xsd name space: e.g., xsd:int). In one implementation of Artix, this represents between 12 and 24 data types.

Note on the Handling of Complex Types

If a data-object is created to handle a composite type, then the in-port may recurse/call back to the Artix Bus to complete its population. For example, if the data-object is created to read a composite value of type foo:A, and this contains a value of type bar:B, then the in-port will recurse/call back on the Artix Bus so that a data-object can be created to handle the value of type bar:B (via the type factory for this type).
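The recursion for composite types can be sketched as below. The type table, the `Bus` and `FlatInPort` classes, and the representation of composite values as dictionaries are assumptions made for this illustration; the point shown is only that the in-port calls back to the bus for each nested member.

```python
# Hypothetical type definitions: composite Qname -> ordered (member, Qname) pairs.
TYPE_DEFS = {
    "foo:A": [("id", "xsd:int"), ("inner", "bar:B")],
    "bar:B": [("label", "xsd:string")],
}
SIMPLE_READERS = {"xsd:int": int, "xsd:string": str}


class Bus:
    def create_and_populate(self, qname, in_port):
        if qname in SIMPLE_READERS:
            return in_port.read_simple(qname)
        return in_port.read_composite(qname, self)


class FlatInPort:
    """Hypothetical in-port reading members from a flat list of raw fields."""

    def __init__(self, fields):
        self._fields = iter(fields)

    def read_simple(self, qname):
        return SIMPLE_READERS[qname](next(self._fields))

    def read_composite(self, qname, bus):
        # Call back to the bus for every member, which may itself be composite.
        return {name: bus.create_and_populate(member_type, self)
                for name, member_type in TYPE_DEFS[qname]}


value = Bus().create_and_populate("foo:A", FlatInPort(["7", "hello"]))
```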

Each in-port and each out-port knows how to handle the “envelope” for each composite type that it handles. For example:

-   -   the data-object for a simple record/structure knows that it contains a fixed number of parts of different types. For example, the binding to handle the SOAP data format knows that each record/structure must have XML elements that enclose the overall record/structure.
    -   the data-object for a simple sequence knows that it contains a variable number of parts of the same type, and that there must be some way to determine the actual number of such parts in any given sequence. For example, the binding to handle the Fixed data format knows that there must be a “count” field that gives the length of a sequence.
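As an illustration of a sequence envelope with a “count” field, the following sketch writes and reads a sequence of integers. The wire layout (a 2-byte big-endian count followed by 4-byte big-endian signed integers) is an assumption chosen for the example and is not the actual Fixed data format.

```python
import struct


def write_sequence(values):
    # Assumed layout: 2-byte count, then one 4-byte signed integer per element.
    return struct.pack(f">H{len(values)}i", len(values), *values)


def read_sequence(buffer, offset=0):
    # The count field tells the reader how many elements follow.
    (count,) = struct.unpack_from(">H", buffer, offset)
    values = struct.unpack_from(f">{count}i", buffer, offset + 2)
    return list(values), offset + 2 + 4 * count


data = write_sequence([10, -2, 7])
decoded, next_offset = read_sequence(data)
```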

Note on Handling Type Extensions

If a message part is supposed to contain a type foo:A but in fact contains an extended type bar:E, then the following happens. Firstly, a data-object corresponding to type foo:A is created, but when this tries to populate itself via the in-port, the in-port notices that the actual type of data in the message is bar:E. A recursion/call-back to the Artix runtime then occurs so that a data-object corresponding to type bar:E is created, and it is instructed to populate itself via the in-port.
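The handling of an extended type can be sketched as follows. How an in-port detects the actual type is middleware-specific; here it is reduced to a hypothetical `actual_type()` call, and the `Runtime`, `DataObject` and `DictInPort` names are likewise assumptions.

```python
# Hypothetical field lists; bar:E extends foo:A with one extra field.
TYPE_FIELDS = {"foo:A": ["id"], "bar:E": ["id", "note"]}


class DataObject:
    def __init__(self, qname):
        self.qname = qname
        self.values = {}

    def populate(self, in_port, runtime):
        actual = in_port.actual_type()
        if actual != self.qname:
            # Call back to the runtime for a data-object of the actual type.
            return runtime.create(actual).populate(in_port, runtime)
        for field in TYPE_FIELDS[self.qname]:
            self.values[field] = in_port.read(field)
        return self


class Runtime:
    def create(self, qname):
        return DataObject(qname)


class DictInPort:
    """Hypothetical in-port whose message already carries its actual type."""

    def __init__(self, actual, fields):
        self._actual = actual
        self._fields = fields

    def actual_type(self):
        return self._actual

    def read(self, field):
        return self._fields[field]


runtime = Runtime()
in_port = DictInPort("bar:E", {"id": 1, "note": "rush"})
populated = runtime.create("foo:A").populate(in_port, runtime)
```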

Note on the Interface to Data-Objects

The interface to a data-object is in two parts. The first part contains functions that instruct a data-object to populate itself via an in-port and to write itself to an out-port, as discussed previously. Among other uses, these are used by the Artix Bus to facilitate the interoperability of middleware.

The second part contains functions to read and write individual parts of the values held by a data-object. Among other uses, these functions are used by applications built on top of Artix. For example, an application that wishes to send a message using Artix can use the write functions to create and populate a data-object that can be delivered to an out-port. The advantage of this for such an application is that it can construct and populate such a data-object and have it delivered to an out-port without having to be concerned with the details of the out-port, and in particular what data format, message format and/or protocols (among other details) it uses to transmit the data. Using a list of these data-objects, an application can send a full message independently of the underlying middleware that is chosen to transmit it (by changing the out-port, the chosen middleware can be changed, without changing the application code that creates and populates the data-objects).

Another example of using these functions is for an application to be able to accept an incoming message from any underlying middleware without being concerned with the data format, message format and/or protocols (among other details) used to receive the data. An application can receive a full message independently of the underlying middleware, and handle the data as a list of data-objects.
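The application-facing part of the interface can be illustrated with a sketch in which the same application code feeds two different out-ports. The `DataObject` methods `set`, `get` and `write`, and both out-port classes with their output formats, are hypothetical.

```python
class DataObject:
    """Hypothetical data-object exposing read/write access to its parts."""

    def __init__(self, qname):
        self.qname = qname
        self._values = {}

    def set(self, field, value):
        self._values[field] = value
        return self

    def get(self, field):
        return self._values[field]

    def write(self, out_port):
        out_port.write_record(self.qname, self._values)


class XmlOutPort:
    def __init__(self):
        self.wire = []

    def write_record(self, qname, values):
        tag = qname.split(":")[1]
        body = "".join(f"<{k}>{v}</{k}>" for k, v in values.items())
        self.wire.append(f"<{tag}>{body}</{tag}>")


class DelimitedOutPort:
    def __init__(self):
        self.wire = []

    def write_record(self, qname, values):
        self.wire.append("|".join(str(v) for v in values.values()))


def build_message():
    # Application code: unaware of the out-port that will transmit the data.
    return [DataObject("foo:Order").set("symbol", "ACME").set("qty", 100)]


def send(data_objects, out_port):
    for data_object in data_objects:
        data_object.write(out_port)


xml_port, delimited_port = XmlOutPort(), DelimitedOutPort()
send(build_message(), xml_port)
send(build_message(), delimited_port)
```

Swapping the out-port changes what appears on the wire while `build_message` stays unchanged, which is the property the passage above describes.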

Streaming

In yet another implementation of Artix, individual data elements, or groups of these, are passed from an in-port to an out-port. Data objects and type factories are not created to help pass a message from an in-port to an out-port along a path.

In the streaming approach, the in-port (or intermediary) (hereafter, the supplier) is given a functor object on which it makes a sequence of calls. As the supplier parses each data element in the message or input stream that it handles, it calls a function on the functor object. A functor object provides such a function for each of the data types that it supports, normally the basic data types (e.g., some subset of those defined in the xsd name space: e.g., xsd:int), and also for the generic complex types (such as sequences, structures/records, and so on).

Where there is no intermediary on a path, an in-port would call functions on a functor that is processed by an out-port (in one implementation of Artix, the out-port itself acts as the functor object). Artix creates different paths, through the in-ports and out-ports that it has configured, by giving an in-port a reference to a functor that is handled by an out-port of Artix's choosing. For example, for one operation on in-port ip1, Artix could give ip1 a reference to a functor handled by out-port op1; whereas for a different operation it could give ip1 a reference to a functor handled by out-port op2. (This is an example of operation-based routing; Artix can handle other forms of routing, as explained elsewhere.)

Where there are one or more intermediaries on a path, the in-port is given a functor that is handled by a first intermediary (in one implementation, the intermediary itself acts as the functor object); which in turn is given a functor that is handled by the next intermediary; with the last intermediary given a functor that is handled by the out-port. Again, Artix can set up various paths, and use various forms of routing. For example, an in-port could call functions on a functor that is handled by a security intermediary, which in turn calls functions on a functor that is handled by a transformation intermediary, which in turn calls functions on a functor that is handled by an out-port.
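The chaining of functors along a path can be sketched as follows. The functor method names (`on_int`, `on_string`, `begin_struct`, `end_struct`) and the masking behaviour of the intermediary are assumptions for illustration; in this sketch the out-port and the intermediary each act as their own functor object.

```python
class RecordingOutPort:
    """Hypothetical out-port acting as the final functor on the path."""

    def __init__(self):
        self.calls = []

    def on_int(self, value):
        self.calls.append(("int", value))

    def on_string(self, value):
        self.calls.append(("string", value))

    def begin_struct(self, name):
        self.calls.append(("begin", name))

    def end_struct(self):
        self.calls.append(("end",))


class MaskingIntermediary:
    """Hypothetical security step: masks strings, forwards everything else."""

    def __init__(self, downstream):
        self._down = downstream

    def on_int(self, value):
        self._down.on_int(value)

    def on_string(self, value):
        self._down.on_string("*" * len(value))

    def begin_struct(self, name):
        self._down.begin_struct(name)

    def end_struct(self):
        self._down.end_struct()


class InPort:
    """Supplier: calls the functor once per data element as it is parsed."""

    def __init__(self, functor):
        self._functor = functor

    def handle(self, name, fields):
        self._functor.begin_struct(name)
        for value in fields:
            if isinstance(value, int):
                self._functor.on_int(value)
            else:
                self._functor.on_string(value)
        self._functor.end_struct()


out_port = RecordingOutPort()
InPort(MaskingIntermediary(out_port)).handle("login", ["secret", 3])
```

No data-object is created at any point; each element reaches the out-port as soon as the in-port has parsed it.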

One of the advantages of streaming is that it makes it easier to implement certain intermediaries in certain ways. For example, if XSLT is used to transform data, it is normal to feed it an input stream, and for it to produce an output stream. Using the blocking approach, the sequence of data objects from the in-port (or other intermediate step) would have to be transformed to a stream to feed the XSLT transformation step, and the stream produced by the XSLT transformation step would have to be transformed back to a sequence of data objects to feed the next intermediary or the out-port.

Another advantage is that data objects do not need to be created. In some cases, the creation of these objects can cause a significant overhead (depending on issues such as: the number of objects that need to be created; the size of these objects; the programming language being used; and so on).

Yet another advantage is that the system can handle operation payloads that cannot fit into memory, or cannot fit efficiently into memory.

Note on Converting from Streaming to Blocking, and Vice Versa

Mapping from streaming to blocking, and from blocking to streaming, is needed to accommodate “old” in-ports, out-ports and intermediaries.

Artix provides support for converting from streaming to blocking, and from blocking to streaming. The streaming to blocking converter takes a stream as input (that is, it handles the calls made on a functor object) and produces a sequence of data objects. The blocking to streaming converter takes a sequence of data objects and makes a corresponding set of calls on a functor object.
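Both converters can be sketched in a few lines. Data objects are reduced here to (Qname, value) pairs, and the functor is limited to two hypothetical calls, `on_int` and `on_string`; these simplifications are assumptions made for the example.

```python
class StreamToBlocking:
    """Acts as a functor and collects the calls into a sequence of data objects."""

    def __init__(self):
        self.data_objects = []

    def on_int(self, value):
        self.data_objects.append(("xsd:int", value))

    def on_string(self, value):
        self.data_objects.append(("xsd:string", value))


def blocking_to_stream(data_objects, functor):
    """Replay a sequence of data objects as calls on a functor object."""
    calls = {"xsd:int": functor.on_int, "xsd:string": functor.on_string}
    for qname, value in data_objects:
        calls[qname](value)


original = [("xsd:string", "ACME"), ("xsd:int", 100)]
collector = StreamToBlocking()
blocking_to_stream(original, collector)   # blocking -> streaming -> blocking
```

Feeding the output of one converter into the other returns the original sequence, which is what allows blocking and streaming elements to be mixed on one path.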

These conversions are very useful. They allow, for example, an in-port implemented using blocking to interface to a transformation step implemented using streaming (e.g., the XSLT transformation intermediary mentioned earlier). They allow, for example, a transformation intermediary implemented using streaming to feed an out-port implemented using blocking. Similarly, mismatches between consecutive intermediaries in a path can be overcome.

In some environments, streaming is the preferred approach, but where there are some in-ports, out-ports and/or intermediaries already implemented using blocking, these converters between streaming and blocking allow these to be used in an environment in which the other elements of the various paths are implemented using streaming. In addition, a development team may need to add an element to a path (for example, an in-port for a particular middleware), but may not wish to take on the additional burden of writing a streaming element, instead preferring to implement a blocking element. To fit their new element into paths that already contain one or more streaming elements, they may need to use a converter.

Another use of the converter from streaming to blocking is to allow a server directly implemented on top of the Artix APIs to see a blocking interface if it wishes, even though the last part of the path to the server is implemented using streaming.

Another use of the converter from blocking to streaming is to allow a server directly implemented on top of the Artix APIs to see a streaming interface if it wishes, even though the last part of the path to the server is implemented using blocking.

Yet another use of the converter from blocking to streaming is to allow a client directly implemented on top of the Artix APIs to use a blocking interface if it wishes, even though the first part of the path from the client is implemented using streaming.

Yet another use of the converter from streaming to blocking is to allow a client directly implemented on top of the Artix APIs to use a streaming interface if it wishes, even though the first part of the path from the client is implemented using blocking.

Some of the Optimizations Carried Out by Artix

One of the optimizations carried out by Artix is that it can pass pointers/references to data from the in-port to the out-port without copying the data that is pointed to. For example, if an incoming message has a string, then a data-object can be populated (in whole or in part) with a pointer/reference to this string rather than making a copy of the string. The out-port, when it is given this data-object, can access the original string. Avoidance of copying has a number of advantages, including: saving the effort and time required to carry this out; and placing less stress on the heap (one advantage of this is that SMP machines can scale better to the task). Great care must be taken when implementing this optimization because each middleware has different rules for when it, or the code built on top of it, can, will, should or must deallocate (free) memory (such as the string used in the above example).
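The no-copy idea can be illustrated, in a language with managed memory, by a data-object that holds a view onto the in-port's buffer rather than a copy of the string. The buffer layout and the `StringDataObject` class are assumptions for this sketch; it does not model the deallocation rules of any particular middleware.

```python
# Buffer owned by the in-port's middleware: a 4-byte header, then a string part.
incoming = bytearray(b"HDR:hello world")


class StringDataObject:
    """Hypothetical data-object holding a reference to the string, not a copy."""

    def __init__(self, reference):
        self.reference = reference


# Slicing a memoryview does not copy the underlying bytes.
data_object = StringDataObject(memoryview(incoming)[4:])

# The out-port reads the original bytes through the reference.
first_read = bytes(data_object.reference)

# A change to the original buffer is visible through the reference, which
# shows that the data-object shares the memory instead of owning a copy.
incoming[4] = ord("J")
second_read = bytes(data_object.reference)
```

The same sharing is why care is needed in practice: the reference is only valid for as long as the owner of the buffer keeps the underlying memory alive and unchanged.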

In contrast, an integration technique based on a canonical format for data at runtime must copy the incoming data into the canonical format. This not only increases the load on the processor, but it also puts a lot of stress on the heap because of all of the heap allocations and subsequent de-allocations carried out. An integration technique that uses XML as the canonical format would use, on the incoming side, an XML generator to construct the canonical format, and an XML parser, on the outgoing side, to access its parts. XML generators and parsers use heap allocations for each data element/attribute, and they are also very processor intensive.

Some integration techniques transmit the canonical format between processes, and possibly between machines. Artix can be assembled into a single process, and yet its paths can be dynamically (or statically) configured.

Yet another optimization carried out by Artix is that data does not always have to be re-formatted when being sent between middlewares. Consider an in-port, which handles a particular string in UTF-8, and an out-port that handles that string in Unicode. Choice of an inappropriate canonical format could result in the string being converted from UTF-8 to that format, and then to Unicode.

Consider instead that both the in-port and out-port handle the string in UTF-8. The in-port won't know that the string doesn't need to be converted in this case, because it doesn't know the identity of the out-port (which could be chosen dynamically, in any case). Use of a canonical format other than UTF-8, such as Unicode, would result in two unnecessary conversions. Instead, in Artix the string is passed, unconverted, from the in-port to the out-port (via data-objects in the blocking approach; or via the functor object in the streaming approach) in a way that allows the out-port to request the string in whatever format it needs. In our example, if the out-port requests the string in UTF-8 format then no conversion would be required (and using the no-copy optimization, the out-port could receive a pointer to the original copy of the string). Only if the out-port requests the string in another format, Unicode for example, would a conversion need to be carried out.

This technique applies not just to strings, but to all data types. It is particularly relevant to text data because this can be large. In some cases, it is not worth applying to small binary data (such as integers).

The no-copy and no-conversion optimizations described here apply equally to intermediary steps/elements in a path, and not just to in-ports and out-ports. A string traversing a path that has intermediary steps/elements may still require no copying and no conversions. Copying and conversion will be done only as required along the path. If a string is once transformed from, say, UTF-8 to Unicode for the use by one intermediary, the other intermediaries and the final out-port can request either UTF-8 or Unicode without further conversion.

When working with some middleware, the Artix Bus can be informed of the operation name, and start the Artix-controlled processing of a message, before the whole of the message has been received by the in-port (in a series of packages/chunks or in a stream) and/or processed by the in-port. With the Artix streaming approach and some variations of the blocking approach, an in-port can still be receiving and/or processing parts of an overall message, while intermediaries and possibly the out-port along a path can be processing and even transmitting earlier parts of the same message.

Additionally, the present invention provides for an article of manufacture comprising computer readable program code contained within, implementing one or more modules that implement a system integrating a plurality of middleware without canonicalization of data.

Furthermore, the present invention includes a computer program code-based product, which is a storage medium having program code stored therein which can be used to instruct a computer to perform any of the methods associated with the present invention. The computer storage medium includes any of, but is not limited to, the following: CD-ROM, DVD, magnetic tape, optical disc, hard drive, floppy disk, ferroelectric memory, flash memory, ferromagnetic memory, optical storage, charge coupled devices, magnetic or optical cards, smart cards, EEPROM, EPROM, RAM, ROM, DRAM, SRAM, SDRAM, or any other appropriate static or dynamic memory or data storage devices.

CONCLUSION

A system and method have been shown in the above embodiments for the effective implementation of a data bus between middleware layers. While various preferred embodiments have been shown and described, it will be understood that there is no intent to limit the invention by such disclosure, but rather, it is intended to cover all modifications falling within the spirit and scope of the invention, as defined in the appended claims. For example, the present invention should not be limited by software/program, computing environment, or specific computing hardware.

The above enhancements are implemented in various computing environments. For example, the present invention may be implemented on a conventional IBM PC or equivalent, multi-nodal system (e.g., LAN) or networking system (e.g., Internet, WWW, wireless web). All programming and data related thereto are stored in computer memory, static or dynamic, and may be retrieved by the user in any of: conventional computer storage, display (i.e., CRT) and/or hardcopy (i.e., printed) formats. The programming of the present invention may be implemented by one of skill in the art of middleware programming.

1. A method to integrate a plurality of middleware without canonicalization of data at runtime, said method implemented using in-ports and out-ports, each in-port facilitating communication between a specific middleware and a first interface and each out-port facilitating communication between another specific middleware and said first interface, said method comprising the steps of: (a) receiving inputs identifying at least a first and second middleware to be made interoperative, said interoperation implemented via at least one communication path between an in-port corresponding to said first middleware and an out-port corresponding to said second middleware; (b) receiving an incoming message at said in-port; (c) handling said received message as a plurality of parts and, for each part: identifying an associated data type; identifying a type factory for said identified data type; creating at least one data-object based on said identified type factory; passing said created data-object to said in-port; said in-port populating said data-object with values from corresponding part of said message; passing said populated data object from the in-port corresponding to said first middleware to said out-port corresponding to the second middleware; and (d) iteratively repeating step (c) for each part of said message.
2. The method of claim 1, wherein said method further comprises the step of creating said type factory according to said associated data type.
3. A method to integrate a plurality of middleware without canonicalization of data at runtime, said method implemented using in-ports and out-ports, each in-port facilitating communication between a specific middleware and a first interface and each out-port facilitating communication between another specific middleware and said first interface, said method comprising the steps of: (a) receiving inputs identifying at least a first and second middleware to be made interoperative, said interoperation implemented via at least one communication path between an in-port corresponding to said first middleware and an out-port corresponding to said second middleware; (b) receiving an incoming message at said in-port; (c) handling said received message as a plurality of parts and, for each part: identifying an associated data type; identifying a type factory for said identified data type; creating at least one data-object based on said identified type factory; passing said created data-object to said in-port; and said in-port populating said data-object with values from corresponding part of said message; (d) iteratively repeating step (c) for each part of said message and populating a plurality of data-objects; and (e) passing said plurality of populated data objects from the in-port corresponding to said first middleware to said out-port corresponding to the second middleware.
4. The method of claim 3, wherein said method further comprises the step of creating said type factory according to said associated data type.
5. An article of manufacture comprising computer storage medium having computer readable program code that is executable by a computer to implement a method to integrate a plurality of middleware without canonicalization of data at runtime, said method implemented using in-ports and out-ports, each in-port facilitating communication between a specific middleware and a first interface and each out-port facilitating communication between another specific middleware and said first interface, said medium comprising: (a) computer readable program code receiving inputs identifying at least a first and second middleware to be made interoperative, said interoperation implemented via at least one communication path between an in-port corresponding to said first middleware and an out-port corresponding to said second middleware; (b) computer readable program code receiving an incoming message at said in-port; (c) computer readable program code handling said received message as a plurality of parts and, for each part: computer readable program code identifying an associated data type; computer readable program code identifying a type factory for said identified data type; computer readable program code creating at least one data-object based on said identified type factory; computer readable program code passing said created data-object to said in-port; computer readable program code populating said data-object with values from corresponding part of said message; computer readable program code passing said populated data object from the in-port corresponding to said first middleware to said out-port corresponding to the second middleware; and (d) computer readable program code iteratively repeating step (c) for each part of said message.
6. An article of manufacture comprising computer storage medium having computer readable program code that is executable by a computer to implement a method to integrate a plurality of middleware without canonicalization of data at runtime, said method implemented using in-ports and out-ports, each in-port facilitating communication between a specific middleware and a first interface and each out-port facilitating communication between another specific middleware and said first interface, said medium comprising: (a) computer readable program code receiving inputs identifying at least a first and second middleware to be made interoperative, said interoperation implemented via at least one communication path between an in-port corresponding to said first middleware and an out-port corresponding to said second middleware; (b) computer readable program code receiving an incoming message at said in-port; (c) computer readable program code handling said received message as a plurality of parts and, for each part: computer readable program code identifying an associated data type; computer readable program code identifying a type factory for said identified data type; computer readable program code creating at least one data-object based on said identified type factory; computer readable program code passing said created data-object to said in-port; and computer readable program code populating said data-object with values from corresponding part of said message; (d) computer readable program code iteratively repeating step (c) for each part of said message and populating a plurality of data-objects; and (e) computer readable program code passing said plurality of populated data objects from the in-port corresponding to said first middleware to said out-port corresponding to the second middleware.